Abstract

Background: Human aneuploidy is the leading cause of early pregnancy loss, mental retardation, and multiple 
congenital anomalies. Due to the high mortality associated with aneuploidy, the pathophysiological mechanisms of 
aneuploidy syndrome remain largely unknown. Previous studies focused mostly on whether dosage compensation 
occurs, and the next generation transcriptomics sequencing technology RNA-seq is expected to eventually uncover 
the mechanisms of gene expression regulation and the related pathological phenotypes in human aneuploidy. 

Results: Using next generation transcriptomics sequencing technology RNA-seq, we profiled the transcriptomes of 
four human aneuploid induced pluripotent stem cell (iPSC) lines generated from monosomy X (Turner syndrome), 
trisomy 8 (Warkany syndrome 2), trisomy 13 (Patau syndrome), and partial trisomy 11:22 (Emanuel syndrome), as well 
as two umbilical cord matrix iPSC lines as euploid controls, to examine how phenotypic abnormalities develop with 
an aberrant karyotype. A total of 466 M (50-bp) reads were obtained from the six iPSC lines, and over 13,000 mRNAs were 
identified by gene annotation. Global analysis of gene expression profiles and functional analysis of differentially 
expressed (DE) genes were performed. Over 5000 DE genes were identified between each aneuploid line and the euploid iPSCs, 
and nine KEGG pathways were enriched in all four aneuploid samples. 

Conclusions: Our results demonstrate that the extra or missing chromosome has extensive effects on the whole 
transcriptome. Functional analysis of differentially expressed genes reveals that the genes most affected in 
aneuploid individuals are related to central nervous system development and tumorigenesis. 



Background 

Aneuploidy, an abnormal number of chromosomes in 
humans, is the result of a gain or loss of a chromosome 
during cell division. Human aneuploidy was first discovered 
in 1959 by Lejeune and colleagues through monosomy X, 
also known as Turner syndrome [1]. Aneuploidy is the 
leading cause of early pregnancy loss, mental retardation, 
and multiple congenital anomalies [2]. Among first-trimester 
abortions, lethality due to aneuploidy is greater than 
that from all other causes combined [3]. Scientists have 
always been interested in determining how aneuploidy 
affects a fetus, and the molecular mechanisms of this 
condition have been studied for a long time. However, the 
high mortality rate associated with aneuploidy limits the 
capability to study aneuploidy syndromes systematically. 
As a result, most of the published gene expression studies 
of human aneuploidy involved patients and/or mouse 
models of Down syndrome [4-11]. 

In recent years, several aneuploid human embryonic 
stem cell (ESC) lines have been established as models for 
studying human aneuploidy syndromes [12-15], which has 
expanded the scope of aneuploidy research, leading to 
investigations of other syndromes caused by the gain or 
loss of a chromosome. Compared with ESC models, 
induced pluripotent stem cell (iPSC) models, produced by 
reprogramming differentiated human somatic cells 
into a pluripotent state, can be applied more easily 
to the study of human disease [16]. Recently, two laboratories 
generated iPSCs from patients with an aneuploid 
syndrome [17,18]. iPSCs were shown to stably maintain the 
karyotype of the donors and to behave like ESCs [17]. Given 
the depth and sensitivity of the RNA-seq method, 
analyzing the gene expression profiles of these aneuploid 
iPSCs offers a promising way to understand the pathological 
mechanisms of human aneuploidy. 

Several recent studies based on DNA microarray techni- 
ques concluded that an extra or missing chromosome may 
have a major effect on gene expression on the particular 
chromosome but only a minor effect on the whole 
transcriptome [4,8,9,19]. Conversely, some other studies 
suggested the extra or missing chromosome has a global 
effect on the whole transcriptome that is regulated by 
dosage compensation [20,21]. Dosage compensation is a 
process that mainly restores gene dosage to a balanced 
level between the X chromosome and autosomes in mammals 
and has been reported in an aneuploid condition [22]. 
Under the influence of dosage compensation, some genes 
on the extra or missing chromosome show no change 
in gene product levels compared with disomic controls 
[23]. However, using RNA sequencing, Xiong and colleagues 
found no dosage compensation of the active X chromosome 
and revised the current model of dosage compensation. 
Their work suggests that next generation sequencing 
technologies can eventually uncover the mechanism of 
gene expression regulation and its related pathological 
phenotypes in human aneuploidy [24-26]. 

To examine how phenotypic abnormalities develop with 
aberrant karyotype, we profiled the transcriptomes of four 
human iPSC lines by RNA-seq technology on a next gen- 
eration sequencing platform. The four iPSC lines were 
generated from monosomy X (Turner syndrome), trisomy 8 
(Warkany syndrome 2), trisomy 13 (Patau syndrome), and 
partial trisomy 11:22 (Emanuel syndrome), which are 
seldom associated with postnatal survival. We compared 
the gene expression profiles of the four aneuploid iPSCs 
with those of two iPSCs generated from umbilical cord 
matrix cells (UMCs) as euploid controls and attempted to 
discover how the extra or missing chromosome affects the 
human transcriptome and the specific transcriptional 
changes caused by dosage imbalance. Functional analysis 
of differentially expressed (DE) genes allowed us to deter- 
mine the significance of several processes in aneuploidy 
during embryonic development. The aim of this study was 
to explain how aneuploidy disrupts fetal development and 
contributes to phenotypic variations in order to better 
understand the molecular etiopathology of aneuploidy. 

Results 

SOLiD transcriptome sequencing of aneuploid and 
euploid iPSCs 

We generated a highly detailed transcriptome profile for 
four aneuploid iPSC and two euploid iPSC clones using 
RNA-seq. iPSCs are easy to create from UMCs, and the 
procedure produces large numbers of cells that escape 
acquired somatic cell mutations; two such lines (UMC1 
and UMC6) were used as euploid controls. All transcriptome 
libraries were generated and sequenced on a SOLiD v3 
platform (Applied Biosystems, Foster City, CA, USA). We 
obtained 59.5 M and 58.2 M (50-bp) reads from the two 
UMC samples and 83.1-90.7 M (50-bp) reads from the 
four aneuploid samples. 

Sequenced reads were mapped onto the human genome 
(hg19) using Corona Lite (see Methods; detailed mapping 
results are given in Table 1A). Approximately 41-47% of 
reads from the four aneuploid iPSC lines were uniquely 
mapped onto the reference genome, compared to only 
24% and 33% of reads in the two euploid controls. 



Table 1 Statistical information of RNA-seq mapping result.

A. Number of reads in each cell line

Sample    Total reads    Total Mapped    Percent    Unique Mapped    Percent
UMC1      59,478,926     19,119,147      32.14%     14,256,083       23.97%
UMC6      58,161,509     26,559,086      45.66%     19,416,912       33.38%
T8        86,659,524     46,672,781      53.86%     36,393,943       42.00%
T13       90,676,957     49,763,484      54.88%     38,656,136       42.63%
T22       83,120,658     49,582,084      59.65%     38,498,635       46.32%
XO        88,052,908     46,906,900      53.27%     36,512,405       41.47%

B. Gene expression levels. More than 80% genes are moderately expressed.

Expression Level (RPKM)    UMC1     UMC6     T8       T13      T22      XO
Low 0.3-1                  1755     1762     1817     1831     1933     1850
Medium 1-100               11350    11270    11548    11534    11041    11060
High > 100                 299      331      281      255      250      295
Total                      13404    13363    13646    13620    13224    13205

A. Number of reads (total reads, mapped reads, and uniquely mapped reads) in each cell line. B. Distribution of all expressed genes among different
expression levels. More than 80% of genes show medium expression.
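
The percentages in Table 1A follow directly from the read counts. As a minimal sketch (not the authors' code; the function and variable names are illustrative), the mapping rates and the total read count can be recomputed as follows:

```python
# Read counts copied from Table 1A: (total, total mapped, uniquely mapped).
READ_COUNTS = {
    "UMC1": (59_478_926, 19_119_147, 14_256_083),
    "UMC6": (58_161_509, 26_559_086, 19_416_912),
    "T8": (86_659_524, 46_672_781, 36_393_943),
    "T13": (90_676_957, 49_763_484, 38_656_136),
    "T22": (83_120_658, 49_582_084, 38_498_635),
    "XO": (88_052_908, 46_906_900, 36_512_405),
}


def mapping_rates(counts):
    """Return {sample: (mapped %, uniquely mapped %)} rounded to two decimals."""
    rates = {}
    for sample, (total, mapped, unique) in counts.items():
        rates[sample] = (
            round(100 * mapped / total, 2),
            round(100 * unique / total, 2),
        )
    return rates


# Sum of the "Total reads" column across the six iPSC lines.
TOTAL_READS = sum(total for total, _, _ in READ_COUNTS.values())

if __name__ == "__main__":
    for sample, (mapped_pct, unique_pct) in mapping_rates(READ_COUNTS).items():
        print(f"{sample}: {mapped_pct:.2f}% mapped, {unique_pct:.2f}% uniquely mapped")
    print(f"Total reads: {TOTAL_READS / 1e6:.1f} M")
```

The summed total corresponds to the 466 M reads quoted in the abstract.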
Only the uniquely mapped reads were used for further 
analysis, most of which (66-77%) were mapped onto 
exons. To examine the influence of the difference in 
mapping rates between samples, we conducted a saturation 
analysis. As shown in Additional File 1, the data set with 
the fewest mapped reads, UMC1, had a saturation curve 
that approached a plateau. Thus, the transcriptome 
sequencing was deep enough, and the discrepancy between 
samples can be eliminated by normalization. 
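
The exact procedure behind the saturation curves is not described in the text. One common way to obtain such a curve, sketched here under assumptions, is to subsample the uniquely mapped reads at increasing depths and count how many genes are still detected. The detection threshold `min_reads`, the fixed random seed, and the function name are all illustrative choices, not taken from the study.

```python
import random


def saturation_curve(read_gene_ids, fractions, min_reads=1, seed=0):
    """Count genes detected when only a fraction of the reads is kept.

    read_gene_ids: one gene identifier per uniquely mapped read.
    fractions: increasing sampling fractions in (0, 1].
    min_reads: reads required to call a gene detected (assumed threshold).
    Returns a list of (number of reads kept, number of genes detected).
    """
    rng = random.Random(seed)
    shuffled = list(read_gene_ids)
    rng.shuffle(shuffled)
    curve = []
    for fraction in fractions:
        depth = int(len(shuffled) * fraction)
        counts = {}
        # Nested prefixes of one shuffle give a non-decreasing curve.
        for gene in shuffled[:depth]:
            counts[gene] = counts.get(gene, 0) + 1
        detected = sum(1 for n in counts.values() if n >= min_reads)
        curve.append((depth, detected))
    return curve
```

A curve that flattens well before the full depth suggests that additional sequencing would detect few new genes.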

Read densities for each gene were calculated as the 
number of uniquely mapped reads per kb per million 
mapped reads (RPKM), and over 13,000 mRNAs were 
identified by gene annotation (Table 1B). Hierarchical 
clustering of gene expression data showed that the aneuploid 
samples exhibit similar expression profiles (Figure 1), 
whereas the euploid iPSC clones generated from UMCs were 
most similar to each other. The expression differences 
between aneuploid and euploid iPSC clones were minor 
on a global scale, which agrees with published microarray 
data showing that Turner syndrome iPSCs clustered 
separately from normal iPSCs with only minor 
discrepancies [17]. To further investigate the expression 
differences between aneuploid and euploid iPSCs, we 
calculated Pearson's correlation coefficients between the 
six cell lines. The scatter plots between all aneuploid 
iPSCs are presented in Figure 2, and the scatter plot of 
UMC1 and UMC6 is presented in Additional File 2. The 
correlation analysis showed that, at the whole transcriptome 
level, expression profiles do not differ substantially 
between aneuploid and euploid clones (Table 2). 
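
The RPKM measure used above can be written out explicitly. The sketch below uses the standard definition (reads per kilobase of gene length per million mapped reads) and bins values with the ranges listed in Table 1B. How the boundary values 1 and 100 are assigned, and the label for values below 0.3, are assumptions, since the table does not specify them.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene length per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def expression_level(value):
    """Bin an RPKM value using the ranges of Table 1B.

    Boundary handling is an assumption: [0.3, 1) is low, [1, 100] is
    medium, and anything above 100 is high. Values below 0.3 fall outside
    the table and are labeled "below range" here.
    """
    if value < 0.3:
        return "below range"
    if value < 1:
        return "low"
    if value <= 100:
        return "medium"
    return "high"
```

For example, a hypothetical 2,000-bp gene with 500 uniquely mapped reads in UMC1 (14,256,083 uniquely mapped reads) has an RPKM of roughly 17.5 and falls in the medium bin.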

Differential gene expression between aneuploid and 
euploid iPSCs 

We considered a gene to be significantly DE between two 
iPSC lines if both the P-value and the Q-value reported by 
DEGseq were less than 0.05. If a gene changed in the same 
direction (up or down) in the comparisons of one aneuploid 
cell line against both UMC1 and UMC6, it was classified as a 
"both" up-regulated or down-regulated gene. We found that more 
than 60% of up- or down-regulated genes in aneuploid 
clones were "both" up- or down-regulated genes, confirming 
the differences between aneuploid and euploid iPSC 
clones. The numbers of up- and down-regulated genes in 
each aneuploid line were generally similar. There were 
more up-regulated genes in trisomy 8 and trisomy 13, 
whereas there were more down-regulated genes in trisomy 
22 and monosomy X (Figure 3A, Additional File 3). 
Compared to previous transcriptome analyses of trisomy 
13 and trisomy 8 with DNA microarrays [19,20], RNA-seq 
data detect more signal from expressed genes. Thus, microarray 
results may not be able to reliably identify differentially 
expressed genes with small fold changes [27], whereas 
RNA-seq technology performs well in measuring gene 
expression levels with sufficient depth and sensitivity [28]. 

Figure 1 Hierarchical clustering results of gene expression data. The four aneuploid iPSC lines are trisomy 8 (T8), trisomy 13 (T13), partial 
trisomy 11:22 (T22), and monosomy X (XO). The two euploid iPSC lines are UMC1 and UMC6. Columns represent cell lines and rows represent 
genes. Fold change values compared to mock are represented using log2 expression according to the color key on the right. 

Figure 2 Pearson's correlation coefficient scatter plots between all aneuploid iPSCs 
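
The "both" classification can be illustrated with a short sketch. It assumes that each aneuploid-versus-control comparison has already been reduced to a mapping from gene to direction of change. The denominator used for the fraction (the union of DE genes from the two comparisons) is an assumption, because the text does not state it.

```python
def classify_both(de_vs_umc1, de_vs_umc6):
    """Find genes regulated in the same direction against both controls.

    Each argument maps gene -> "up" or "down" for one comparison of an
    aneuploid line against a euploid control.
    """
    both = {"up": set(), "down": set()}
    for gene, direction in de_vs_umc1.items():
        if de_vs_umc6.get(gene) == direction:
            both[direction].add(gene)
    return both


def both_fraction(de_vs_umc1, de_vs_umc6):
    """Fraction of all DE genes (union of both comparisons) that are "both" genes."""
    both = classify_both(de_vs_umc1, de_vs_umc6)
    all_de = set(de_vs_umc1) | set(de_vs_umc6)
    n_both = len(both["up"]) + len(both["down"])
    return n_both / len(all_de) if all_de else 0.0
```

A gene that is up-regulated against one control and down-regulated against the other is not counted in either "both" set.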

Table 2 Pearson's correlation coefficients between all 
aneuploid and euploid iPSCs.

        UMC1     UMC6     T8       T13      T22      XO
UMC1    1
UMC6    0.967    1
T8      0.939    0.942    1
T13     0.924    0.942    0.947    1
T22     0.939    0.925    0.924    0.930    1
XO      0.931    0.935    0.932    0.939    0.935    1



We then applied more stringent fold-change cut-off values 
to define DE genes between aneuploid and euploid iPSC 
clones (Figure 3B) and found that the number of DE 
genes decreased dramatically. With a fold-change cut-off of 
1.5, 26-34% of expressed genes were DE, which 
decreased to only 6-8% with a fold-change cut-off of 3 
and fell even further to 3-4% with a fold-change cut-off 
of 5. These results confirm that aneuploidy has a dosage 
effect on gene expression levels. We selected two genes, 
SLC25A6 (solute carrier family 25, member 6) and 
PRKX (protein kinase X), to validate our RNA-seq 
results by quantitative PCR (qPCR) in the XO cell line and 
a euploid cell line. The relative expression levels of both 
genes were nearly 2-fold higher in the euploid sample than 
in XO, in accordance with the differential expression 
measured by RNA sequencing (Figure 4). 
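
The effect of tightening the fold-change cut-off can be sketched as a simple count. The example assumes that fold changes are expressed as aneuploid/euploid ratios and that up- and down-regulation are treated symmetrically (a ratio of 1/3 counts as a 3-fold change); both points are assumptions, as the text does not spell them out.

```python
def count_de_by_fold_change(fold_changes, cutoffs=(1.5, 3, 5)):
    """Count genes changing by at least each cut-off in either direction.

    fold_changes: positive aneuploid/euploid expression ratios for genes
    that already pass the P-value and Q-value filters.
    """
    counts = {}
    for cutoff in cutoffs:
        counts[cutoff] = sum(
            1 for fc in fold_changes if fc >= cutoff or fc <= 1 / cutoff
        )
    return counts
```

Dividing each count by the number of expressed genes gives percentages comparable to those reported for Figure 3B.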

We examined the up- or down-regulation of all expressed 
genes on each chromosome and found that transcriptome 
regulation is ubiquitous on all chromosomes, not just 
on the extra chromosome or the single remaining 
chromosome (Figure 5). In the four aneuploid cell lines, 8-20% of 
genes on each chromosome were up-regulated, whereas 
the percentage of down-regulated genes varied between 
5% and 24%, a slightly wider range than for up-regulation. 
The exceptions were chromosome 19 in all four aneuploid 
lines, chromosome 3 in trisomy 22, and chromosome 10 
in trisomy 8. The atypical pattern of regulation on 
chromosome 19 was very similar among the four aneuploid 
samples, with less than 10% of genes up-regulated and more 
than 20% down-regulated (as high as 35% in monosomy X). 
For chromosome 3 in trisomy 22, a very low percentage 
(only 0.7%) of genes were up-regulated, whereas more than 
70% of genes on the same chromosome were down-regulated. 
A similarly extreme but opposite pattern occurred on 
chromosome 10 in trisomy 8, with only 2.9% of genes 
down-regulated and 43.3% up-regulated. 
Notably, the proportion of down-regulated genes on each 
chromosome in monosomy X was much higher than in the 
other three aneuploid cell lines, which may be caused by 
the loss of an X chromosome. 

Figure 3 Number of DE genes between euploid and aneuploid iPSCs. The threshold is P value<0.05 and Q value<0.05. A. DE genes are 
classified into up- or down-regulated genes with fold change>1.5. B. DE genes are counted with different cut-offs of fold change (FC). 
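
The per-chromosome percentages discussed above are ratios of DE genes to all expressed genes on the same chromosome. A minimal sketch of that bookkeeping, with an illustrative input format of one (chromosome, status) pair per expressed gene, is:

```python
from collections import defaultdict


def regulation_by_chromosome(genes):
    """Percentage of expressed genes up- or down-regulated per chromosome.

    genes: iterable of (chromosome, status) pairs, one per expressed gene,
    with status in {"up", "down", "none"}.
    """
    totals = defaultdict(int)
    changed = defaultdict(lambda: {"up": 0, "down": 0})
    for chromosome, status in genes:
        totals[chromosome] += 1
        if status in ("up", "down"):
            changed[chromosome][status] += 1
    return {
        chromosome: {
            direction: 100 * changed[chromosome][direction] / total
            for direction in ("up", "down")
        }
        for chromosome, total in totals.items()
    }
```

Using the number of expressed genes as the denominator keeps chromosomes with many or few genes comparable.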

Functional profiling of DE genes 

The presence of an extra chromosome or the absence of a 
chromosome has various molecular effects on 
aneuploid individuals during fetal development. To 
explore the connection between the functional 
categories of DE genes and the symptoms of aneuploidy 
syndromes, we sought to elucidate common regulatory 
patterns among aneuploid iPSC lines. Functional clustering 
analysis of DE genes between each aneuploid line 
and the two euploid controls was performed using the 
Database for Annotation, Visualization, and Integrated 
Discovery (DAVID) [29]. Following the online instructions 
provided by DAVID, we examined KEGG pathways and 
Gene Ontology (GO) terms with P-values less than 0.05 
and gene counts greater than 2. We identified 28 KEGG 
pathways for trisomy 8, 19 KEGG pathways for trisomy 
13, 23 KEGG pathways for trisomy 22, and 18 KEGG 
pathways for monosomy X. Nine pathways appeared in 
all four aneuploid cell lines: axon guidance, 
calcium signaling, focal adhesion, ribosome, MAPK 
signaling pathway, p53 signaling pathway, vascular smooth 
muscle contraction, pathways in cancer, and basal cell 
carcinoma (Figure 6). GO terms found in all four 
aneuploid lines are shown in Additional File 4. The biological 
processes among the GO terms shared by all aneuploid cell 
lines were related to ion transmission, the central nervous 
system, regulation of apoptosis, and cell proliferation, 
which is consistent with the identified KEGG pathways. 

Figure 4 Validation of RNA-seq results by qPCR. SLC25A6 and PRKX were selected for qPCR validation of 
mRNA expression. The relative expression values of SLC25A6 and PRKX are nearly 2-fold higher in the euploid sample than in XO. Values are referred to the 
respective iPS cell lines. 
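
Finding the pathways common to all four aneuploid lines amounts to intersecting the per-line lists of enriched terms. The sketch below shows this step only; the input lists are toy placeholders and not the study's DAVID output.

```python
def shared_terms(terms_by_line):
    """Return the terms enriched in every cell line, sorted alphabetically."""
    term_sets = [set(terms) for terms in terms_by_line.values()]
    if not term_sets:
        return []
    return sorted(set.intersection(*term_sets))


# Toy input for illustration only; "placeholder pathway" entries are invented.
EXAMPLE_TERMS = {
    "T8": ["Axon guidance", "Focal adhesion", "placeholder pathway A"],
    "T13": ["Axon guidance", "Focal adhesion"],
    "T22": ["Focal adhesion", "Axon guidance", "placeholder pathway B"],
    "XO": ["Axon guidance", "Focal adhesion", "placeholder pathway A"],
}

if __name__ == "__main__":
    print(shared_terms(EXAMPLE_TERMS))
```

The same function applies unchanged to GO terms or to any other set of enriched categories.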

Because of the atypical pattern of gene expression 
regulation on chromosome 3 in trisomy 22 and chromosome 10 
in trisomy 8, we performed functional clustering of the DE 
genes on these chromosomes and compared the results with 
those for chromosome 22 in trisomy 22 and 
chromosome 8 in trisomy 8, using chromosome 1 as 
a control. We found 8 overlapping KEGG/GO terms 
between chromosome 3 and chromosome 22, out of 17 
terms for chromosome 22 in trisomy 22, which suggests 
a functional connection between the chromosome 
with abnormally regulated genes (chromosome 3) and the 
extra chromosome (chromosome 22) in trisomy 22. However, 
we did not find any overlapping KEGG/GO terms 
between chromosome 8 and chromosome 10 in trisomy 8, 
probably because there are only 4 terms for chromosome 8 
in trisomy 8. KEGG/GO terms of DE genes on chromosome 3 
in trisomy 22 and on chromosome 10 in trisomy 8 
are listed in Additional File 5. 

Figure 5 Gene regulation distribution on each chromosome. Percentages of DE genes out of all expressed genes on each chromosome are 
shown for up-regulated genes (A) and down-regulated genes (B). Breaks in the y-axis scale are introduced because of the high percentages for 
chromosome 10 in T8 and chromosome 3 in T22. 

In addition, we performed pathway analysis for DE 
genes in the four aneuploid iPSC lines relative to euploid 
controls using the GeneGo Pathway tool (Additional File 6). 
Among the common functional groups, pathways related 
to development were the predominant group 
among all functional categories. Cell adhesion, cytoskeleton 
remodeling, and immune response were also among the main 
functional groups identified in each cell line. 

Discussion 

An extra or missing chromosome has global effects on 
gene expression 

Our study provides a comprehensive view of 
four human aneuploid induced pluripotent stem cell 
(iPSC) lines and two human euploid iPSC lines through 
transcriptome profiling with high-throughput next generation 
sequencing, yielding datasets of differentially expressed 
genes. Notably, most published analyses of aneuploidy 
gene expression have relied on DNA microarray 
techniques [8,10,11,13,19], a hybridization-based 
methodology with well-known limitations such as lower 
sensitivity for weakly expressed genes. Here, the application 
of next generation sequencing technologies to quantify 
gene expression levels helps us to better understand the 
complexity of aneuploidy gene expression patterns as well 
as the relationship between gene expression and pathological 
phenotypes. 

To investigate how an extra or missing chromosome 
affects gene expression in aneuploid cells, Mao and colleagues 
measured the expression of transcripts in different 
tissue/cell types of trisomy 21 and found that only 
chromosome 21 shows significant differential expression 
relative to euploid controls [8]. Similarly, Hisakatsu and 
colleagues generated artificial trisomy 8 cells and analyzed 
their gene expression profiles by microarray. They 
found higher average gene expression on the additional 
chromosome 8 but lower average gene expression levels 
on all non-trisomic chromosomes [20]. However, David 
and colleagues presented transcriptome analyses of human 
fetal cells from pregnancies affected with trisomy 21/13 
and trisomy 18 amniocyte cells, in which the relative expression 
levels between chromosomes showed a stable pattern with 
no significant differences between individual RNA samples 
in microarray experiments [19]. Because of the relatively high 
uncertainty of microarray methodology, the discrepancies 
between these expression patterns may have resulted from 
differences in specific procedures or in the selected 
tissues. Based on the high quality next generation whole 
transcriptome sequencing results in this study, we propose 
that an extra or missing chromosome has extensive effects 
on the whole transcriptome. We measured the gene 
expression profiles deeply enough in three trisomy lines and 
one XO iPSC line to demonstrate that gene expression 
regulation occurs on every chromosome of each aneuploid 
sample (Figure 5). The percentage of differentially regulated 
genes on the aneuploid chromosome was not significantly 
different from that on the other, diploid chromosomes. A 
possible reason is that DE genes on the extra or missing 
chromosome influence gene expression regulation on 
other chromosomes. Likewise, another recent study that 
used microarrays to estimate gene expression changes caused 
by artificial aneuploidy indicates that the gain of a 
single chromosome can indeed result in the up- or 
down-regulation of 140-202 genes, with only 5-20% of the up- 
or down-regulated genes located on the extra chromosome 
[30]. Notably, in each aneuploid cell line, less than 
one-third of expressed genes on the particular aneuploid 
chromosome are up- or down-regulated when the fold-change 
cut-off value is set to 1.5 (Figure 5). To a certain extent, 
this supports the existence of dosage compensation. Dosage 
compensation is commonly observed for sex chromosomes 
according to previous studies, and it could have a 
similar influence on the aneuploid chromosome. 
Alternatively, this phenomenon could be attributed to a 
buffering and feedback mechanism. 

Figure 6 Clustered heatmap of KEGG pathway enrichment analysis. Pathways found in more than 2 aneuploid cell lines are shown. The 
color intensities indicate the enrichment score of each KEGG pathway. 

Aneuploidy mainly affects the development of the nervous 
system 

We investigated how an extra or missing chromosome 
leads to molecular effects on aneuploid individuals during 
fetal development by profiling the whole transcriptomes of 
iPSCs derived from aneuploidy syndromes. In addition to 
identifying genes relevant to the aneuploid phenotype, we 
used functional profiling to identify significantly disrupted 
biological pathways. Nine KEGG pathways were identified 
in all four aneuploid iPSC lines: axon guidance, calcium 
signaling, focal adhesion, ribosome, MAPK signaling 
pathway, p53 signaling pathway, vascular smooth muscle 
contraction, pathways in cancer and basal cell carcinoma. 
The top three are all associated with nervous system 
development. 

Axon guidance is an important process in the development 
of the central nervous system, in which attractive and 
repulsive guidance cues steer axonal growth cones 
along specific pathways [31]. Several signaling 
pathways of guidance molecules, such as Slit-Robo and 
Eph/Ephrin, are also included in the list of GeneGo 
pathways (Additional File 6). The Slit-Robo signaling 
pathway primarily provides important molecular cues for 
axon guidance during the assembly of the nervous system 
[32]. Recent research using Robo and Slit gene knockout 
mice has indicated that the Slit-Robo interaction is an 
integral factor during genesis of the corpus callosum [33], 
and the key genes of the Slit-Robo signaling pathway are 
also expressed in human fetal brain [34]. Agenesis of the 
corpus callosum was observed in mosaic trisomy 8 according 
to two case reports [35,36] and was found in 19% 
of 63 individuals with trisomy 22 [37]. We believe the 
absence or hypoplastic state of the corpus callosum in 
aneuploid syndromes is related to impaired axon 
guidance caused by misregulated genes in the Slit-Robo 
signaling pathway. In addition, Ephrin ligands and 
their cognate Eph receptors guide axons during neural 
development and are emerging as key players in synapse 
formation and plasticity in the central nervous system [38]. 
The central nervous system anomalies in trisomy 13 have 
been reported to include partial agenesis of the corpus 
callosum and neuronal heterotopias in the cerebellum 
[39], and each aneuploidy shows deficiency in 
neurodevelopment to a different extent. 

The focal adhesion pathway is required for both attractive 
and repulsive cues to guide axons to their specific 
targets during development of the nervous system [40]. Focal 
adhesions may have other functions, such as control of 
cytoskeletal dynamics, but they mainly affect the trisomy 
phenotype by influencing axon guidance pathways. Additionally, 
the two axon guidance pathways, Slit-Robo and Eph/Ephrin, 
and the focal adhesion pathway all influence retinal 
development [41-43], and abnormal retinal phenotypes 
have been reported in trisomy syndromes [44,45]. 

It is not surprising that a large number of misregulated 
genes are involved in the calcium signaling pathway, as 
Ca2+ is a ubiquitous second messenger in signal transduction 
pathways. Ca2+ signals affect axon guidance by mediating the reversal 
of neuronal migration induced by the slit2 gene or pathway 
[46]. They also play a key role in regulating the neuronal 
growth cone while mediating growth and turning 
responses [47], which might be a minor cause of the corpus 
callosum agenesis observed in aneuploid syndromes. 
The calcium signaling pathway also contributes to the phenotype 
of aneuploidy through another identified pathway, vascular 
smooth muscle contraction, which is directly influenced 
by calcium concentration [48]. Many cardiovascular 
diseases, especially vascular hypertension, originate from 
abnormal function of vascular smooth muscle. Some 
patients with Turner syndrome (TS) have been found to have 
higher cardiovascular morbidity [49,50]. Alzheimer's disease 
might also arise in aneuploidy syndromes through disturbances 
caused by altered Ca2+ levels [51]. 

Aneuploidy and tumorigenesis 

Another set of KEGG pathways enriched in all four aneuploid 
lines is related to cancer, including p53 signaling, pathways 
in cancer, and basal cell carcinoma. The question of 
how aneuploidy affects cancer initiation and progression has 
been studied for over a century [52], though its genetic 
basis remains unclear. Most cancers contain aneuploid 
cells, and an abnormal number of chromosomes has been 
proposed to be essential for tumorigenesis [53]. 

Trisomy 8 and trisomy 13 have been reported to predispose 
to neoplasms, mainly acute myeloid leukemia (AML), 
suggesting roles for an extra chromosome 8 or 13 in 
tumorigenesis [54,55]. Trisomy 8 has been shown to be 
the most frequent trisomy in AML, leading 
to tumor-specific gene-dosage effects such as significantly 
down-regulated apoptosis-regulating genes [56]. Although 
there is no explicit association between X chromosome 
genes and neoplasms, basal cell carcinoma, another enriched 
pathway in our study, was diagnosed in a TS patient 
[57,58]. Several recent investigations showed that TS 
patients have significantly increased risks of tumors, 
especially in the central nervous system, bladder, and urethra 
[57-59]. 
Cumulatively, our results together with current evidence 
suggest that, besides multiple developmental abnormalities, 
aneuploidy is associated with alterations in the risk for 
specific cancers. The extra or missing chromosome disrupts 
global transcription and may promote tumorigenesis 
by disturbing cancer-related pathways. However, 
the lethality of aneuploidy makes it difficult 
to investigate whether the gain or loss of a 
chromosome contributes to tumorigenesis by down-regulating 
the expression of tumor suppressor genes and/or 
up-regulating the expression of oncogenes. Further 
molecular biological studies and more clinical reports 
are needed to establish how aneuploidy 
affects tumorigenesis. 

Regarding the unusually large number of DE genes on 
chromosome 3 in trisomy 22, four KEGG pathways are enriched on 
that chromosome: axon guidance, colorectal 
cancer, glycosaminoglycan degradation, and endometrial 
cancer. Three of them are related to the nervous system or to 
tumorigenesis. This might explain why far more genes are 
down-regulated on chromosome 3 in trisomy 22 than 
on other chromosomes. 

Our Gene Ontology analysis confirms our KEGG path- 
way results, especially with respect to nervous system 
development. The integrated results in this study demon- 
strate that genes involved in nervous system development 
and tumorigenesis are most affected pathologically in 
aneuploid individuals. Our results provide initial indications 
of possible biological pathways affected by aneuploidy, 
based on deep transcriptome sequencing. In 
addition, we offer a better understanding of the 
early etiology of congenital anomalies, which may 
promote future innovative approaches in health treatment. 

Conclusions 

Using next generation transcriptomics sequencing tech- 
nology RNA-seq, we profiled the transcriptomes of four 
human aneuploid induced pluripotent stem cell (iPSC) 
lines generated from monosomy X (Turner syndrome), 
trisomy 8 (Warkany syndrome 2), trisomy 13 (Patau syn- 
drome), and partial trisomy 11:22 (Emanuel syndrome) as 
well as two umbilical cord matrix iPSC lines as euploid 
controls. A total of 466 M (50-bp) reads were obtained 
from six iPSC lines, and over 13,000 mRNAs were identi- 
fied by gene annotation. Global analysis of gene expression 
profiles and functional analysis of differentially expressed 
(DE) genes were implemented to examine how phenotypic 
abnormalities develop with aberrant karyotype. Our results 
demonstrate that the extra or missing chromosome has 
extensive effects on the whole transcriptome. Functional 
analysis of differentially expressed genes reveals that the 
genes most affected in aneuploid individuals are related to 
central nervous system development and tumorigenesis. 



Methods 

Next-generation transcriptome sequencing and data 
processing 

All human iPSC clones presented here were obtained from 
the South China Institute for Stem Cell Biology and 
Regenerative Medicine, Guangzhou Institutes of Biomedi- 
cine and Health, and have been described before [17]. 
Library construction was based on a protocol described 
previously [60,61]. Total RNA of each line was extracted 
using TRIzol reagent (Invitrogen) according to the manu- 
facturer's instructions. Poly(A)+ mRNA was isolated from 
total RNA using Oligotex (QIAGEN). RNA was fragmented 
with RNase III in preparation for constructing the 
transcriptome library of each iPSC line. The Applied Biosystems 
SOLiD Whole Transcriptome Analysis Kit 
(http://solid.appliedbiosystems.com) was used to reverse 
transcribe the isolated 140-200 bp RNA fragments 
into single-stranded cDNA. 

Sequence data were generated using the SOLiD3 system 
(Applied Biosystems) following the manufacturer's instructions. 
RNA-seq reads were mapped onto the human reference 
genome (NCBI37/hg19) with Corona_lite_v4.2.2 
software (Applied Biosystems), setting the parameters for 
full-length read mapping (50, 45, 40, 35 bp) with 5, 4, 4, 
and 3 mismatches, respectively. Only reads that uniquely mapped to 
the genome and corresponded to mRNA genes 
were chosen for subsequent analysis. Read density for 
each gene (shown as RPKM value) was calculated as the 
number of uniquely mapped reads per kb per million 
mapped reads. Hierarchical clustering 
was performed in R using the pheatmap package. 
Pearson correlation coefficients for each pair of iPSC 
lines were calculated from the log2 RPKM values with the 
cor function in R. 
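
The authors computed the correlations with the cor function in R. Purely as an illustration of the same calculation, the sketch below computes Pearson's correlation coefficient on log2-transformed RPKM values in Python. The pseudocount added before the logarithm is an assumption made here to avoid log2(0); the text does not say how zero values were handled.

```python
import math


def log2_rpkm(values, pseudocount=1.0):
    """log2-transform RPKM values; the pseudocount (assumed) avoids log2(0)."""
    return [math.log2(v + pseudocount) for v in values]


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

Applying `pearson` to the transformed RPKM vectors of two cell lines gives one entry of a matrix such as Table 2.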

Detection of DE genes 

DE genes between aneuploid iPSCs (T8, T13, T22, and 
XO) and normal iPSCs (UMC1 and UMC6) were identified 
with DEGseq, an R package available 
from Bioconductor (http://www.bioconductor.org/packages/2.7). 
DEGseq is a free R package that detects DE 
genes between two samples, with or without replicates, from 
RNA sequencing data [62]. The MA plot-based method 
(where M is the log ratio of the counts between two 
experimental conditions for each gene, and A is the two 
group average of the log concentrations of the gene) with 
a random sampling method (MARS) was selected. DE 
genes were computed separately for each pairing of the four 
aneuploid samples with the two euploid samples. The raw 
count of each gene was used, and the DEGexp function was 
run for the analysis. 
A gene was considered to be significantly DE if its 
P-value and Q-value were both less than 0.05. For each 
gene, the level of change in expression is stated as a 
fold-change. 
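
The M and A quantities defined above, together with the significance rule, can be written out for a single gene. This sketch only computes the MA coordinates and applies the P-value and Q-value thresholds; it does not reproduce the statistical model that DEGseq uses to derive those values, and the function names are illustrative.

```python
import math


def ma_values(count_1, count_2):
    """Return (M, A) for one gene from its read counts in two samples.

    M is the log2 ratio of the counts and A is the mean of the log2 counts.
    Both counts must be positive; zero counts need separate handling.
    """
    if count_1 <= 0 or count_2 <= 0:
        raise ValueError("counts must be positive")
    log_1, log_2 = math.log2(count_1), math.log2(count_2)
    return log_1 - log_2, (log_1 + log_2) / 2


def is_significant(p_value, q_value, alpha=0.05):
    """Apply the rule from the text: both P-value and Q-value below 0.05."""
    return p_value < alpha and q_value < alpha
```

For instance, counts of 8 and 2 give M = 2 (a 4-fold difference) and A = 2.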



Zhang et al. BMC Genomics 2013, 14(Suppl 5):S8 
http://www.biomedcentral.com/1471-2164/14/S5/S8 






Functional profiling of DE genes 

The Database for Annotation, Visualization, and Integrated 
Discovery (DAVID) was used to identify KEGG pathways 
and enriched gene ontology categories of DE genes [29]. 
DE genes between each of the four aneuploid samples and UMC1/ 
UMC6 were computed separately, and those DE genes 
expressed in UMC1 or UMC6 were selected as DAVID 
input datasets. Following the instructions in the DAVID manual, 
the dataset of each sample was uploaded and the functional 
charts were generated. Functional groups with a 
P-value less than 0.05 and gene counts greater than 2 were 
examined. Pathway maps from a manually curated proprietary 
database (MetaCore™, GeneGo, St. Joseph, MI) were used 
for pathway analysis of DE genes between different samples. 
For each gene set, we chose the top 50 pathways ranked 
by P-value. 
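
The two selection rules described above (P-value below 0.05 with more than 2 genes, and the top 50 pathways by P-value) are simple filters over an enrichment table. A minimal sketch, assuming each result is a (name, P-value, gene count) tuple, which is an illustrative format rather than the actual DAVID or MetaCore export, is:

```python
def filter_terms(terms, max_p=0.05, min_count=2):
    """Keep terms with P-value < max_p and gene count > min_count.

    terms: list of (name, p_value, gene_count) tuples.
    """
    return [t for t in terms if t[1] < max_p and t[2] > min_count]


def top_pathways(pathways, n=50):
    """Return the n pathways with the smallest P-values."""
    return sorted(pathways, key=lambda t: t[1])[:n]
```

Both thresholds are exposed as parameters so that the effect of a stricter or looser cut-off can be checked quickly.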

Availability of supporting data 

The data used in this study are available at the NCBI GEO 
database (http://www.ncbi.nlm.nih.gov/projects/geo, 
accession number GSE49247). 

Additional File 1: Saturation curves of UMC1. The number of expressed 
genes (blue curve) and the correlation of expression (red curve) are plotted 
against sequencing depth. Only mRNAs are selected for further analysis. 

Additional File 2: Pearson's correlation coefficient scatter plots 
between two euploid iPSCs, UMC1 and UMC6. 

Additional File 3: List of DE genes between each aneuploid and 
euploid iPSC line. DE genes identified by DEGseq between each aneuploid and 
euploid iPSC line are listed in the table with p-value<0.05, q-value<0.05, and 
fold change >=1.5. 

Additional File 4: Clustered heatmap of GO enrichment analysis. GO 
terms found in all four aneuploid cell lines are shown. The color 
intensities indicate the enrichment score of each GO term. 

Additional File 5: KEGG/GO terms of DE genes on chromosome 3 in 
trisomy 22 and on chromosome 10 in trisomy 8. KEGG/GO terms 
found for DE genes on chromosome 3 in trisomy 22 and on chromosome 10 
in trisomy 8 with p-value<0.05 and counts >2. 

Additional File 6: Pathway analysis using the GeneGo Pathway tool. 
Each number represents the number of functional terms found in each 
functional group. 